Recognizing speech in voice messages

Authors

Abstract

The level of development of information technology makes it possible to use speech recognition technologies in a wide range of human life and activities. The voice interface is very convenient: searching for necessary documents, dialing a phone number, managing IoT devices, navigation, and simple text dictation. Since a natural language interface provides additional convenience to a person when typing, sending voice messages has become common among users. In this case, the messages are audio files. But it is not always possible for the recipient to listen to such messages. This problem can be solved with the help of an automatic speech recognition (ASR) system. The article describes the stages and elements of the process of processing the speech signal. Modern ASR systems and the problems of choosing among them are indicated. ASR systems understand fully spontaneous speech that is natural, not memorized, and contains signs of stuttering or even minor errors. At the same time, they are still too expensive to develop from scratch. So companies are faced with a choice between using cloud ASR APIs developed by tech giants and open source solutions. Data from an analysis of the latest research and publications are considered. A software solution for the conversion of voice messages into text is proposed. For signal delivery, the proposed solution is made as a chat bot for a messenger. The article presents the main components of the system, the algorithm of the bot, modern tools for its development, and the implementation and configuration of the bot in the messenger.

Similar resources

Coding schemes for time encoded speech (TES) voice messages

This paper reports an investigation into the use of Time-Encoded Speech (TES) [1] for the economical storage of digital voice messages in the tactical military arena. Initial results indicate that bit rate reductions of between 20% and 55% may be available using simple coding schemes. INTRODUCTION The need for an economical digital description of the human speech waveform for applicatio...

Recognizing Uncertainty in Speech

We address the problem of inferring a speaker’s level of certainty based on prosodic information in the speech signal, which has application in speech-based dialogue systems. We show that using phrase-level prosodic features centered around the phrases causing uncertainty, in addition to utterance-level prosodic features, improves our model’s level of certainty classification. In addition, our ...

Recognizing emotion in speech

This paper explores several statistical pattern recognition techniques to classify utterances according to their emotional content. We have recorded a corpus containing emotional speech with over 1000 utterances from different speakers. We present a new method of extracting prosodic features from speech, based on a smoothing spline approximation of the pitch contour. To make maximal use of th...

Recognizing Sloppy Speech

As speech recognition moves from labs into the real world, the sloppy speech problem emerges as a major challenge. Sloppy speech, or conversational speech, refers to the speaking style people typically use in daily conversations. The recognition error rate for sloppy speech has been found to double that of read speech in many circumstances. Previous work on sloppy speech has focused on modeling...

Recognizing Speech from Sim

In this paper we present and evaluate factored methods for recognition of simultaneous speech from multiple speakers in single-channel recordings. Factored methods decompose the problem of jointly recognizing the speech from each of the speakers by separately recognizing the speech from each speaker. In order to achieve this, the signal components of the target speaker in each case must be enha...

Journal

Journal title: Vìsnik Priazovsʹkogo deržavnogo tehnìčnogo unìversitetu

Year: 2022

ISSN: 2519-271X, 2225-6733

DOI: https://doi.org/10.31498/2225-6733.45.2022.276225